ABSTRACT

The Proteomics Standard Initiative Common QUery 
Interface (PSICQUIC) specification was created by 
the Human Proteome Organization Proteomics 
Standards Initiative (HUPO-PSI) to enable
computational access to molecular-interaction data
resources by means of a standard Web Service 
and query language. Currently providing >150 
million binary interaction evidences from 28 
servers globally, the PSICQUIC interface allows the 
concurrent search of multiple molecular-interaction 
information resources using a single query. Here, 
we present an extension of the PSICQUIC 
specification (version 1.3), which has been 
released to be compliant with the enhanced 
standards in molecular interactions. The new 
release also includes a new reference
implementation of the PSICQUIC server available to the
data providers. It offers augmented web service 
capabilities and improves the user experience. 
PSICQUIC has been running for almost 5 years,
with the number of data providers growing from only 4
to 28 (April 2013), allowing access to 151310109
binary interactions. The power of this web service
is shown in the PSICQUIC View web application, an
example of how to simultaneously query, browse
and download results from the different PSICQUIC
servers. This application is free and open to all users
with no login requirement
(http://www.ebi.ac.uk/Tools/webservices/psicquic/view/main.xhtml).



INTRODUCTION 

One of the main issues currently facing the scientific
community is the integration of data generated by the
many different instruments and software platforms used
in high-throughput experiments. The Human Proteome
Organization Proteomics Standards Initiative (HUPO-PSI)
was founded with the aim of developing standards
to unify the diversity of data produced by proteomics
experiments (1). In 2004, the Molecular Interaction (MI)
group of the PSI jointly published a community-standard
XML data model for the representation and exchange of
protein-interaction data (2). The same work group
subsequently published the Minimum Information about
a Molecular Interaction Experiment (MIMIx) (3)
guidelines, defining a list of parameters to be supplied when
describing experimental molecular-interaction data in
a journal publication. A number of public interaction
databases have gone still further, forming the
International Molecular Exchange consortium (IMEx)
(4,5) to facilitate assembly of a single non-redundant set
of consistently curated protein-interaction data. This
original XML model was later further refined and in 2007
was supplemented by a simple tab delimited format,
PSI-MITAB (6). These formats have been widely adopted by
molecular-interaction databases (7), enabling the initial
development of the Proteomics Standard Initiative
Common QUery Interface (PSICQUIC) service.

The PSICQUIC (8) specification is defined by means of
a Web Service, with a clear-cut set of methods that take
as input a query in MIQL (Molecular Interactions Query
Language). The initial release of PSICQUIC supported
only the 15 fields of PSI-MITAB 2.5,
which gave a simplistic description of
molecular-interaction data. Following the development of extended
PSI-MITAB formats (6) (version 2.6 and more recently
2.7), the number of fields has been increased to be fully
compliant with MIMIx and to enable presentation of
IMEx-standard curated data (4). All these changes have resulted
in the development of a new PSICQUIC specification
encompassing extensions to the MIQL query language
and a completely new implementation of the PSICQUIC
reference server. Accompanying documentation helps data
suppliers to migrate easily to the new, more efficient and
feature-rich server.



MATERIALS AND METHODS 

PSICQUIC specification 

PSICQUIC defines a minimum set of standard SOAP and 
REST methods to be implemented by every molecular- 
interaction provider. These methods accept a MIQL 
query as input and return, as output, molecular- 
interaction information in one of the standard formats 
(PSI-XML 2.5, PSI-MITAB 2.5, PSI-MITAB 2.6, PSI- 
MITAB 2.7). 

The PSICQUIC SOAP-based services are defined
through a standard WSDL specification that all
implementations must comply with. This definition has
remained stable since PSICQUIC specification version
1.1; however, the capability to return the new
PSI-MITAB versions has been added. Among the various
methods described in the specification, the most flexible
one is getByQuery. It can be used to perform rather
complex queries, as it accepts all the fields defined in
MIQL. The results are returned, as specified by the user,
in one of the standard formats (PSI-XML or
PSI-MITAB). The remaining SOAP methods do not directly
use MIQL. A summary of the main methods is shown in
Table 1 and Table 2, and further information about their
different options is available in the PSICQUIC
specification for SOAP on the PSICQUIC project website
(http://code.google.com/p/psicquic/wiki/PsicquicSpec_1_3_Soap).

In addition to the SOAP-based protocol, PSICQUIC
also implements a set of RESTful services to make it
possible to retrieve data over HTTP using simple URLs.
This protocol can also be used to access molecular
interactions through scripting languages and supports
other common output formats such as Resource
Description Framework (RDF), Biological Pathway
Exchange (BioPAX) and eXtensible Graph Markup and
Modeling Language (XGMML). It should be noted that
these formats have been available since PSICQUIC
specification version 1.2, and they are only available



Table 1. Summary of the main available methods in SOAP

Method name            Description
---------------------  ----------------------------------------------
getByQuery             Retrieve data using an MIQL query
getByInteractor        Retrieve data using a specific participant
                       identifier (equivalent MIQL field: identifier)
getByInteractorList    Retrieve data using a list of participant
                       identifiers. This method can be used to
                       retrieve interactions where the two or more
                       participants passed as arguments are found.
                       To do so, set the operand to AND.
getByInteraction       Retrieve a specific interaction using its
                       identifier (equivalent MIQL field:
                       interaction_id)
getByInteractionList   Retrieve a list of interactions, using the
                       identifiers


Table 2. SOAP methods to retrieve information about the service
itself (metadata)

Method name              Description
-----------------------  --------------------------------------------
getSupportedReturnTypes  Returns the list of possible formats for
                         the retrieved data
getVersion               Gets the version of the service
getProperties            Retrieve a list of the property objects
                         defined in a service by the provider.
                         Each property object has a key and a value
getProperty              Retrieve a property from the service



through the REST service. As in SOAP, there is an
ample set of methods to choose from. Figure 1
shows how to access the information by means of
HTTP GET requests: a template URL describes the main
methods with the different options and the outputs. A more
extensive description of the methods and options is given
in the PSICQUIC specification for REST
(http://code.google.com/p/psicquic/wiki/PsicquicSpec_1_3_Rest).
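
As an illustration of the URL template in Figure 1, the following Python sketch assembles a REST search URL. It is a minimal example rather than part of the PSICQUIC distribution: the function name build_search_url and the base URL example.org are hypothetical, and a real base URL would be obtained from the PSICQUIC Registry. The version, method and parameter names are those listed in Figure 1.

```python
from urllib.parse import quote, urlencode

# Values taken from the URL template in Figure 1.
VERSIONS = {"v1.0", "v1.1", "v1.2", "v1.3", "current"}
METHODS = {"interactor", "interaction", "query"}


def build_search_url(base_url, method, query, version="current", **params):
    """Fill in <PSICQUIC URL>/<VERSION>/search/<METHOD>/<QUERY>?<PARAMETERS>."""
    if version not in VERSIONS:
        raise ValueError(f"unknown version: {version}")
    if method not in METHODS:
        raise ValueError(f"unknown method: {method}")
    # Percent-encode the query so that spaces and quotes survive in the URL path;
    # the colon separating field and value is left readable.
    encoded_query = quote(query, safe=":")
    url = f"{base_url.rstrip('/')}/{version}/search/{method}/{encoded_query}"
    if params:
        url += "?" + urlencode(params)
    return url


# Hypothetical base URL, used here only so that the example runs offline.
EXAMPLE_BASE = "http://example.org/psicquic/webservices"
example_url = build_search_url(
    EXAMPLE_BASE, "query", "identifier:lsm3", format="tab27", maxResults=10
)
print(example_url)
```

The resulting string can be passed to any HTTP client (for example urllib.request.urlopen) to issue the GET request; the sketch stops at URL construction so that it does not depend on a particular server being reachable.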

MIQL 

The main input to the PSICQUIC web service is a query 
written in the MIQL query language. MIQL defines a set 
of standard fields to query molecular-interaction data, 
extending the syntax of the Apache Lucene query 
language on which it is based. 

MIQL has also been updated with the new data fields;
these new fields allow users to filter (or query) molecular
interactions with novel criteria that adjust the result to
their needs and reduce post-processing steps, enabling the
retrieval of information that was previously unavailable.
For example, the new PSI-MITAB 2.6
introduces the 'complex expansion' field. Thanks to this
field, and with the new PSICQUIC service, the user is now
able to distinguish results that come from an original
binary interaction from binary pairs resulting from the
expansion of an n-ary interaction with one of the expansion
methods available (spoke expansion, bipartite expansion

Figure 1. Structure of the URL to fetch data from PSICQUIC service.
Template: <PSICQUIC URL>/<VERSION>/search/<METHOD>/<QUERY>?<PARAMETERS>
  <PSICQUIC URL>: base URL returned by the Registry.
  <VERSION>: v1.0, v1.1, v1.2, v1.3 or current.
  <METHOD>: interactor, interaction or query (data retrieval);
    property, properties, formats or version.
  <QUERY>: user-defined query; a different query may be used
    depending on the chosen <METHOD>.
  <PARAMETERS>: firstResult, maxResults, format.
  format values: xml25, tab25, count; xgmml, biopax, rdf-xml,
    rdf-xml-abbrev, rdf-n3, rdf-turtle; tab26, tab27.



Table 3. Evolution of PSI-MITAB format

PSI-MITAB 2.5 (15 cols)          PSI-MITAB 2.6 (+21 cols)           PSI-MITAB 2.7 (+6 cols)
-------------------------------  ---------------------------------  ----------------------------------
ID(s) interactor A & B           Experimental role(s) A & B         Features A & B
Alt. ID(s) interactor A & B      Biological role(s) A & B           Stoichiometry A & B
Alias(es) interactor A & B       Properties (CrossReference) A & B  Participant detection method A & B
Interaction detection method(s)  Type(s) of interactors A & B
Publication 1st author(s)        Host organism
Publication identifier(s)        Expansion method(s)
Taxid interactor A & B           Annotations A & B
Interaction type(s)              Parameters
Source database(s)               Creation/update date
Interaction identifier(s)        Checksums A, B & interaction
Confidence value(s)              Negative





or matrix expansion); distinguishing this information was
previously impossible. Adding the 'stoichiometry' field
(included in PSI-MITAB 2.7) allows the retrieval of
information about intra-molecular interactions, and with
the inclusion of the 'features' field, PSICQUIC is able to
provide, for the first time, fully MIMIx-compliant
information.
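
To make the difference between the expansion methods concrete, the Python sketch below derives binary pairs from a single n-ary interaction. It assumes the commonly used definitions, which are not spelled out in this article: spoke expansion pairs the bait with every prey, matrix expansion pairs every participant with every other participant, and bipartite expansion pairs every participant with an extra node that stands for the complex. The function names and the four-member complex are hypothetical.

```python
from itertools import combinations


def spoke_expansion(bait, preys):
    """Pair the bait with every prey (one binary pair per prey)."""
    return [(bait, prey) for prey in preys]


def matrix_expansion(participants):
    """Pair every participant with every other participant."""
    return list(combinations(participants, 2))


def bipartite_expansion(complex_id, participants):
    """Pair every participant with an extra node that represents the complex."""
    return [(complex_id, participant) for participant in participants]


# Hypothetical n-ary interaction with four members; "A" plays the role of bait.
members = ["A", "B", "C", "D"]
spoke_pairs = spoke_expansion(members[0], members[1:])
matrix_pairs = matrix_expansion(members)
bipartite_pairs = bipartite_expansion("complex-1", members)

print(spoke_pairs)           # 3 pairs, all containing the bait
print(len(matrix_pairs))     # 6 pairs for 4 participants
print(len(bipartite_pairs))  # 4 pairs, one per participant
```

The same complex therefore yields a different number of binary pairs depending on the method, which is why a record of the expansion method is needed to tell expanded pairs apart from interactions that were observed as binary.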

In the updated PSI-MITAB formats, some information 
has been reallocated to the new columns. This removed 
some previously existing inconsistencies and made the 
records more accurate and easier to access through a 
MIQL query. See PSICQUIC extension for MIQL
(http://code.google.com/p/psicquic/wiki/MiqlReference27)
for a detailed description of the additional fields.
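
Because MIQL follows the Lucene field:value syntax, queries can be composed programmatically. The Python sketch below is one possible way to do so and is not taken from the PSICQUIC libraries: the helper names term and combine are hypothetical, and only the field names identifier and interaction_id (both mentioned in Table 1) are used. Values containing anything other than letters and digits are quoted, which is a conservative choice made for this example.

```python
def term(field, value):
    """Render one field:value clause, quoting values that are not purely alphanumeric."""
    text = str(value)
    if not text.isalnum():
        # Escape backslashes and double quotes before wrapping the value in quotes.
        text = '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'
    return f"{field}:{text}"


def combine(operator, *clauses):
    """Join clauses with AND or OR, adding parentheses when there are several."""
    if operator not in ("AND", "OR"):
        raise ValueError(f"unsupported operator: {operator}")
    if not clauses:
        raise ValueError("at least one clause is required")
    if len(clauses) == 1:
        return clauses[0]
    return "(" + f" {operator} ".join(clauses) + ")"


# Illustrative values only.
query = combine(
    "OR",
    term("identifier", "lsm3"),
    term("interaction_id", "EBI-761050"),
)
print(query)
```

Nested calls to combine produce nested parentheses, so more elaborate filters can be built from the same two helpers.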

Reference implementation 

The open-source reference implementation described in
this article has been wholly rewritten independently from
the original (3) but remains backwards compatible with
the previous versions of all the protocols. This new
PSICQUIC service is based on Apache Solr indexing
software (http://lucene.apache.org/solr/), which is a web
application built on top of Apache Lucene technology.
From the data provider perspective, the new open-source
reference implementation of PSICQUIC allows a
local PSICQUIC server to be easily set up and loaded with
data provided as a valid PSI-MITAB file. It supports the
original 15-column PSI-MITAB 2.5 as well as the newer
PSI-MITAB 2.6 and PSI-MITAB 2.7 formats, with 36 and 42
columns, respectively (see Table 3).
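
Since the three PSI-MITAB versions differ in their number of columns, the version of a record can be inferred from its width. The Python sketch below illustrates this idea under the assumption that records are tab-separated lines with every column present; the function name mitab_version is hypothetical and the check is not the validation performed by the reference implementation.

```python
# Column counts stated in the text: 15 (2.5), 36 (2.6) and 42 (2.7).
MITAB_VERSION_BY_COLUMNS = {15: "2.5", 36: "2.6", 42: "2.7"}


def mitab_version(line):
    """Return the PSI-MITAB version implied by the number of tab-separated columns."""
    # Strip only the line terminator so that empty trailing columns are still counted.
    n_columns = len(line.rstrip("\r\n").split("\t"))
    try:
        return MITAB_VERSION_BY_COLUMNS[n_columns]
    except KeyError:
        raise ValueError(f"unexpected number of columns: {n_columns}") from None


# Synthetic record in which every column holds the placeholder "-".
example_line = "\t".join(["-"] * 36) + "\n"
print(mitab_version(example_line))
```

A data provider could run such a check on the first record of a file before indexing, to catch truncated or mixed-version input early.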

In addition to introducing support for the new PSI- 
MITAB versions and the MIQL extension, extensive 
restructuring of the code resulted in improved response 
time of the server. It also removed the restriction on the 
number of interactions that can be exported in the 
XGMML format used in Cytoscape (9,10), which 

Figure 2. Dataflow followed by the reference implementation from its
origin in the molecular-interaction databases to the end user through
PSICQUIC. (Diagram labels: providers, PSICQUIC web service, PSICQUIC
web server.)



previously existed in the REST protocol (the server now
sends small chunks of interactions until the file is completely
transmitted, instead of truncating it as before). All
these improvements enhance the web service and the
concurrent search of multiple molecular-interaction
databases, independently of the client used.
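
Separately from the server-side chunking described above, a client can retrieve a large result set page by page with the firstResult and maxResults parameters shown in Figure 1. The Python sketch below only computes the parameter sets for each page. It assumes that firstResult is a zero-based offset and that the total number of hits is already known (for example from a request with format=count); the function name page_parameters is hypothetical.

```python
from urllib.parse import urlencode


def page_parameters(total, page_size):
    """Yield one {firstResult, maxResults} dictionary per page of results."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    for first in range(0, total, page_size):
        # The last page may be shorter than page_size.
        yield {"firstResult": first, "maxResults": min(page_size, total - first)}


# Illustrative numbers: 250 hits fetched in pages of 100.
pages = [urlencode(p) for p in page_parameters(total=250, page_size=100)]
print(pages)
```

Each query string can be appended to a search URL, and the pages can be requested in sequence or in parallel depending on what the provider tolerates.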

Server deployment 

The reference implementation source code can be
downloaded from the PSICQUIC Google project repository
(svn co http://psicquic.googlecode.com/svn/tags/psicquic-solr-ws-1.3.8).
It includes a Java class to create the index from the
PSI-MITAB file and a script that can be used to easily start the
indexing process (bash indexMitab.sh /path/to/mitab-file
solr-index-directory). The solr-index
directory will contain the index, the Solr
configuration files, the Solr schema and the solr.war file
required to run the Solr application. More detailed
information and other options to deploy a PSICQUIC
server are available on the PSICQUIC website
(https://code.google.com/p/psicquic/wiki/HowToInstallPsicquicSolr).
In Figure 2, the different elements required to build a




Figure 3. PSICQUIC View is a client for PSICQUIC services in which a
single query fetches all the relevant molecular interactions
available in the registered services. After the search, the user can
choose between studying the results in more detail, viewing the
interaction network, and downloading or clustering the results.
(Screenshot annotations: write the MIQL query, e.g. identifier:lsm3;
press Search to query all the services at the same time; select the
service to study the results in detail; choose between the table or
the graph view; download or cluster the results.)



Nucleic Acids Research, 2013, Vol. 41, Web Server issue W605 




Figure 4. Structure of the URL to fetch data from PSICQUIC registry.
Template: <REGISTRY_URL>?action=<ACTION>&name=<NAME>&format=<FORMAT>&restricted=[y|n]
  action (mandatory): STATUS, ACTIVE or INACTIVE.
  name (optional): PSICQUIC service name.
  format (optional): xml, txt or count.
  restricted (optional): y or n.



PSICQUIC service from an interaction database are
shown to clarify this process. Solr indexing will
enable the development of facilities such as visualization
of statistical data through faceting, indexing from
PSI-XML and data sorting. In addition to using the default
implementation presented, providers can also implement
their own systems to publish interactions as long as they
meet the PSICQUIC specifications
(http://code.google.com/p/psicquic/wiki/PsicquicSpecification).

PSICQUIC clients 

In addition to using the services directly from the browser
(in the case of REST) or creating a custom
client, there are several applications at users' disposal for
querying the web services programmatically. The
PSICQUIC project site (http://psicquic.googlecode.com)
offers open-source libraries for working with the different
standards, Java clients to access the web services
(http://code.google.com/p/psicquic/wiki/JavaClient), code examples
for accessing them from Perl
(http://code.google.com/p/psicquic/wiki/PerlCodeSamples) and Python
(http://code.google.com/p/psicquic/wiki/PythonCodeSamples), and
help for broad use cases. Important clients include the
molecular interactions cluster (http://code.google.com/p/micluster),
the PSICQUIC Client Plugin for Cytoscape
(http://apps.cytoscape.org/apps/psicquicuniversalclient) and
PSICQUIC View
(http://www.ebi.ac.uk/Tools/webservices/psicquic/view/main.xhtml).
See Figure 3.

PSICQUIC registry 

Users are expected to obtain the PSICQUIC web service
SOAP or REST URLs by querying the
PSICQUIC Registry. In addition to providing the
necessary URLs, the registry is itself a REST web
service, offering data on the number of interactions per
service, the status of each service, a statement as to
whether the data are restricted or not, the version of the
software used and a short description of the type of service
given by means of tags. The PSICQUIC registry is
currently hosted at the European Bioinformatics Institute
(http://www.ebi.ac.uk/Tools/webservices/psicquic/registry/registry?action=STATUS).
Figure 4 explains
how to retrieve the information from the registry through
HTTP. More information about the PSICQUIC registry is
available at the PSICQUIC project site
(http://code.google.com/p/psicquic/wiki/Registry).
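
The registry URL described in Figure 4 can be assembled in the same way as a search URL. The Python sketch below is a minimal illustration, assuming that action is the only mandatory parameter and that the remaining parameters may be omitted; the function name build_registry_url is hypothetical, and the parsing of the registry response is deliberately left out because its layout is not described here.

```python
from urllib.parse import urlencode

# Registry address quoted in the text; parameter values as listed in Figure 4.
REGISTRY_URL = "http://www.ebi.ac.uk/Tools/webservices/psicquic/registry/registry"
ACTIONS = {"STATUS", "ACTIVE", "INACTIVE"}
FORMATS = {"xml", "txt", "count"}


def build_registry_url(action, name=None, fmt=None, restricted=None,
                       registry_url=REGISTRY_URL):
    """Build <REGISTRY_URL>?action=...&name=...&format=...&restricted=[y|n]."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    params = {"action": action}
    if name is not None:
        params["name"] = name
    if fmt is not None:
        if fmt not in FORMATS:
            raise ValueError(f"unknown format: {fmt}")
        params["format"] = fmt
    if restricted is not None:
        # The template expects the literal letters y or n.
        params["restricted"] = "y" if restricted else "n"
    return registry_url + "?" + urlencode(params)


active_services_url = build_registry_url("ACTIVE", fmt="txt")
print(active_services_url)
```

A client would typically request the list of active services first and then send the same MIQL query to each REST URL returned.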



DISCUSSION 

In <5 years since its original implementation, PSICQUIC
has grown from 4 to 28 providers supplying >150 million
interactions, with additional services preparing to join.
With the new reference implementation, we open the
door to additional features such as sorting by
different criteria (for example, the confidence score of
the interactions) or faceting to retrieve statistics. Longer-term
plans include direct indexing of the PSI-XML data to
allow processing of the molecular-interaction data
described in the original PSI-XML files, thus avoiding
the currently necessary, lossy conversions between
PSI-XML and PSI-MITAB formats. This, in turn, will
enable the querying and retrieval of n-ary interactions
rather than only binary pairs.